Abstract

We consider a branching population where individuals have i.i.d. life lengths (not
necessarily exponential) and constant birth rate. We let $N_t$ denote the population size
at time $t$. We further assume that all individuals, at birth time, are equipped with
independent exponential clocks with parameter $\delta$. We are interested in the
genealogical tree stopped at the first time $T$ when one of those clocks rings. This
question has applications in epidemiology, in population genetics, in ecology and in
queuing theory.

We show that conditional on $\{T < \infty\}$, the joint law of $(N_T, T, X^{(T)})$, where
$X^{(T)}$ is the jumping contour process of the tree truncated at time $T$, is equal to
that of $(M, -I_M, Y_M')$ conditional on $\{M \neq 0\}$, where: $M + 1$ is the number of
visits of 0, before some single independent exponential clock $\mathbf{e}$ with parameter
$\delta$ rings, by some specified Lévy process $Y$ without negative jumps reflected below
its supremum; $I_M$ is the infimum of the path $Y_M$, defined as $Y$ killed at its last
visit of 0 before $\mathbf{e}$; $Y_M'$ is the Vervaat transform of $Y_M$.

This identity yields an explanation for the geometric distribution of $N_T$ [11, 16] and
has numerous other applications. In particular, conditional on $\{N_T = n\}$, and also on
$\{N_T = n, T < a\}$, the ages and residual lifetimes of the $n$ alive individuals at time
$T$ are i.i.d. and independent of $n$. We provide explicit formulae for this distribution
and give a more general application to outbreaks of antibiotic-resistant bacteria in the
hospital.

1 Introduction



We consider a population of particles behaving independently from one another, where each
particle gives birth at constant rate $b > 0$ during its lifetime (inter-birth durations are
i.i.d. exponential random variables with parameter $b$), and where lifetime durations are
i.i.d. on $(0, +\infty]$ (some particles may have infinite lifetimes) with probability
distribution $\mu$ (not necessarily exponential).

The genealogical trees that we consider here are usually called splitting trees jTj. We 
define the lifespan measure as the measure on $(0, +\infty]$ with total mass $b$ given by
$\pi := b\mu$.

The process $(N_t;\ t \ge 0)$ giving the number of extant particles at time $t$ belongs to a wide
class of branching processes called Crump-Mode-Jagers processes. Actually, the processes 
we consider are homogeneous (constant birth rate) and binary (one birth at a time) but are 
more general than classical (simple) birth-death processes [8] in that the lifetime durations 
may follow a general distribution. 

In addition, we assume that each particle is independently equipped with a random 
exponential clock with parameter $\delta > 0$. We are interested in the first time $T$ when one of
those clocks rings, called detection time. See Figure 1 for a realisation of a splitting tree with
individual clocks. Note that on the extinction event, T can be infinite (no clock rings) with 
positive probability. 

This question has applications in population genetics and in ecology O |6l [121 E] (T is 
then the first time when a new mutant or a new species arises), in queuing theory [111 El IIS] 
(because $N$ is a time-changed processor-sharing queue, and then in the new timescale, $T$
is a single, independent exponential clock), and in epidemiology [H HE] (T is then the first 
detection time of the epidemics). In this last setting, ages of individuals in the population at 
T are the times since infection of infectives in the detected outbreak, and in the last section 
we see how this information can be enlarged with more easily available data such as the 
length of stay in the hospital up to time T. 

Our main result is to characterise, on the event $\{T < \infty\}$, the joint law of
$(N_T, T, X^{(T)})$, where $X^{(T)}$ is the jumping contour process of the tree truncated at
time $T$, in terms of the Vervaat transform of the path of the (reflected) Lévy process $X$
with jump measure $\pi$ and slope $-1$. In particular, we recover the known fact [11, 16]
that conditional on $\{T < \infty\}$, $N_T$ is geometrically distributed, and we
characterise the joint law of $T$ and $N_T$ in terms of (joint) Laplace transforms of some
hitting times of $X$. As a further example, restricting the main identity to the
undershoots and overshoots of $X$ whenever it crosses 0, we get the following application.
Conditional on $\{N_T = n\}$, and also on $\{N_T = n, T < a\}$, the ages and residual
lifetimes of the $n$ alive individuals at time $T$ are i.i.d. and independent of $n$, and
follow the bivariate law of $(\mathrm{Und}, \mathrm{Ove})$ (resp.
$(\mathrm{Und}_a, \mathrm{Ove}_a)$) defined hereafter. The pair
$(\mathrm{Und}, \mathrm{Ove})$ (resp. $(\mathrm{Und}_a, \mathrm{Ove}_a)$) is the undershoot
and overshoot of the jump across 0 of $X$, at its first hitting time $\tau_0^+$ of
$(0, +\infty]$, conditional on $\tau_0^+$ being smaller than some independent exponential
time with parameter $\delta$ (resp. and on $\inf_{0 \le s \le \tau_0^+} X_s > -a$). In the
epidemics model, these statements are extended by taking into account, in addition to the
age and residual lifetime (of individual infection at time $T$), the length of stay in the
hospital up to infection time. In all cases, explicit formulae are also provided for these
laws.

Figure 1: A realisation of a splitting tree with individual exponential clocks. Time flows
from bottom to top; horizontal dashed lines show filiation. Solid dots show individual
ringing clocks. The time $T$ when the first clock rings is indicated. Here $N_T = 7$.

Keywords: branching process; splitting tree; Crump-Mode-Jagers process; contour process;
Lévy process; scale function; resolvent; age and residual lifetime; undershoot and
overshoot; Vervaat's transformation; sampling; detection; epidemiology; processor-sharing.
60J85; 60G17; 60G51; 60G55; 60K15; 60K25.



2 Splitting trees and Lévy processes

We assume that splitting trees are started with one unique progenitor born at time 0. We
denote by $\mathbb{P}$ their law, and the subscript $s$ in $\mathbb{P}_s$ means conditioning
on the lifetime of the progenitor being $s$. Of course if $\mathbb{P}$ bears no subscript,
this means that the lifetime of the progenitor follows the usual distribution $\mu$.

In [13], for $t > 0$, the first author has considered the so-called jumping chronological
contour process (JCCP), here denoted $X^{(t)}$, of the splitting tree truncated up to height
(time) $t$, which starts at $s \wedge t$ (here and in what follows $x \wedge y$ denotes the
minimum of $x$ and $y$), where $s$ is the time of death of the progenitor, visits all
existence times (smaller than $t$) of all individuals exactly once and terminates at 0 (see
Figure 2).



Figure 2: a) A splitting tree; b) The jumping chronological contour process associated with
the same splitting tree after truncation at time $t$. Here $N_t = 9$.

He has shown [13, Theorem 4.3] that the JCCP is a Markov process; more specifically, it
has the same law as the compound Poisson process $X$ with jump measure $\pi$, compensated
at rate $-1$, reflected below $t$, and killed upon hitting 0.

We denote the law of $X$ by $P$, to distinguish it from the law $\mathbb{P}$ of the CMJ
process. As seen previously, we record the lifetime duration, say $s$, of the progenitor, by
writing $P_s$ for its conditional law on $X_0 = s$. To stick to the analogous notation for
$\mathbb{P}$, it will be implicit in the absence of subscript that $X_0$ under $P$ has
probability distribution $\mu$.

Let us be a little more specific about the JCCP. Recall that this process visits all
existence times of all individuals of the truncated tree. For any individual of the tree, we
denote by $\alpha$ its birth time and by $\omega$ its death time. When the visit of an
individual $v$ with lifespan $(\alpha(v), \omega(v)]$ begins, the value of the JCCP is
$\omega(v)$. The JCCP then visits all the existence times of $v$'s lifespan at constant
speed $-1$. If $v$ has no child, then this visit lasts exactly the lifespan of $v$; if $v$
has at least one child, then the visit is interrupted each time a birth time of one of $v$'s
daughters, say $w$, is encountered (youngest child first since the visit started at the
death level). At this point, the JCCP jumps from $\alpha(w)$ to $\omega(w) \wedge t$ and
starts the visit of the existence times of $w$. Since a truncated tree has finite length,
the visit of $v$ has to terminate: it does so at the chronological level $\alpha(v)$ and
continues the exploration of the existence times of $v$'s mother, at the height (time) where
it had been interrupted. This procedure then goes on recursively and terminates as soon as 0
is encountered (birth time of the progenitor). See Figure 2 for an example.
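The traversal just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class `Individual`, the function `jccp_segments` and the small example tree are hypothetical names and data introduced here, not objects from the paper. The function returns the linear pieces (start level, end level) of the contour of the tree truncated at time `t`; each piece is run at speed $-1$, and a new piece starting above the previous end corresponds to a jump to a daughter's death level.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Individual:
    birth: float                      # alpha(v)
    death: float                      # omega(v)
    children: List["Individual"] = field(default_factory=list)


def jccp_segments(root: Individual, t: float) -> List[Tuple[float, float]]:
    """Linear pieces (start, end) of the contour of the tree truncated at time t."""
    segments: List[Tuple[float, float]] = []

    def visit(v: Individual) -> None:
        level = min(v.death, t)       # the visit of v starts at omega(v) ^ t
        # daughters born before t, youngest (largest birth time) first
        daughters = sorted((w for w in v.children if w.birth < t),
                           key=lambda w: w.birth, reverse=True)
        for w in daughters:
            if level > w.birth:       # slide down to the birth level of w
                segments.append((level, w.birth))
            visit(w)                  # jump up, explore w, come back at alpha(w)
            level = w.birth
        if level > v.birth:           # finish the visit of v at alpha(v)
            segments.append((level, v.birth))

    visit(root)
    return segments


# Hypothetical example tree: the progenitor lives on (0, 3] and has two daughters.
c = Individual(4.0, 6.0)
a = Individual(1.0, 5.0, [c])
b_ = Individual(2.0, 2.5)
progenitor = Individual(0.0, 3.0, [a, b_])

pieces = jccp_segments(progenitor, t=4.5)
# Pieces starting at level t count the individuals alive at time t;
# here there are 2 of them (a and c are alive at t = 4.5).
alive_at_t = sum(1 for start, _ in pieces if start == 4.5)
```

The total length of the pieces equals the sum of the truncated lifespans, which is the property used later to express the no-detection probability through the lifetime of the contour.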

Note that the genealogy of a splitting tree truncated at time $t$ can be coded by
associating each individual with a word of integers, such that $\emptyset$ is the root, 1 is
the last daughter of the root born before time $t$, 2 is the next before last daughter of
the root, ..., 11 is the last daughter of 1 (born before time $t$), and so on. Then the
order in which individuals get their first visit by the contour process is the
lexicographical order associated with this (so-called Ulam-Harris-Neveu) labelling. Roughly
speaking, if $u$ and $v$ are two distinct finite words of integers, and $h_u$ (resp. $h_v$)
is the first integer in the word $u$ (resp. $v$) coming immediately after their longest
common prefix, then $u$ comes first in the lexicographical order if and only if
$h_u < h_v$. Here we assume that if $h \neq \emptyset$, then $h > \emptyset$.

Since the JCCP is Markovian (as seen earlier, it is a reflected, killed Lévy process), its
excursions between consecutive visits of points at height $t$ are i.i.d. excursions of $X$
away from $[t, +\infty]$. Observe in particular that the number of visits of $t$ by
$X^{(t)}$ is exactly the number $N_t$ of individuals alive at time $t$. Therefore, it is
easy to see that $N_t$ has a shifted geometric distribution with parameters specified as
follows. Let $\tau_A$ denote the first hitting time of the set $A$ by $X$. We also use the
following shortcuts

$$\tau_x := \tau_{\{x\}} \quad\text{and}\quad \tau_x^+ := \tau_{(x, +\infty]},$$

that is, $\tau_x$ is the first hitting time of $x$ and $\tau_x^+$ is the first hitting time
of the open interval $(x, +\infty]$. Then, conditional on the initial progenitor having
lived $s$ units of time, we have

$$\mathbb{P}_s(N_t = 0) = P_s(\tau_0 < \tau_t^+) \tag{1}$$

and, applying recursively the strong Markov property,

$$\mathbb{P}_s(N_t = n \mid N_t \neq 0) = P_t(\tau_t^+ < \tau_0)^{n-1}\, P_t(\tau_0 < \tau_t^+). \tag{2}$$

Note that the subscript $s$ in the last display is useless. Also note that the spatial
homogeneity of Lévy processes implies $P_t(\tau_0 < \tau_t^+) = P_0(\tau_{-t} < \tau_0^+)$.

In addition, exact formulae can be deduced for (1) and (2) from the fact that the JCCP
is a Lévy process with no negative jumps, using scale functions of the Lévy process $X$.
This part is developed in the final section of the paper.

Later, we see that the population size is not only (conditionally) geometric at fixed
times, but also at the first detection time $T$, using the same decomposition of the contour
process into excursions away from $(T, +\infty]$. This decomposition is displayed in
Subsection 3.1 of Section 3 'Main general results'. Subsections 3.2 and 3.3 are devoted to
path decompositions providing equalities in law for the whole contour process of the tree
stopped at $T$, involving in particular Vervaat's transformation: see Theorems 2, 5 and
especially 6 for the result stated in the abstract. Section 4 focuses on the joint
distribution of $T$, $N_T$ and of the ages and residual lifetimes of the $N_T$ alive
individuals at time $T$. Subsections 4.2 and 4.3 provide explicit formulae (up to scale
functions of the Lévy process $X$) for these distributions. The reader interested in
applications can go straight to the last statement of Section 4, Proposition 11. Finally,
Section 5 extends these results to the example of a pathogen outbreak in the hospital
modelled by a Crump-Mode-Jagers process with constant transmission rate $b$ and i.i.d.
(infection) lifetimes, but also taking into account the length of stay in the hospital up to
infection.

In the rest of the paper, we always use the following notation, where $E$ can be any
expectation operator, $A$, $B$ any events and $Z$ any (positive or integrable) random
variable:

$$E(Z, A, B) := E(Z\,\mathbf{1}_{A \cap B}).$$

2.1 Intuition for the geometric distribution 

In this section, we show how to gain insight from the equivalence of the splitting tree and
the corresponding contour process, as visualised in Figure 2, and in particular we give the
intuition why the number of individuals at the first detection time is geometrically
distributed [11, 16]. This intuition also gives the main ideas behind the rigorous proofs
below.

We consider the event that $N_t = n$ ($n > 0$) and no detection has occurred yet at time
$t$, i.e., the event $\{N_t = n, T > t\}$. This event occurs if the following events
successively occur.

1. The Lévy process $X$ following the contour of the tree truncated below time $t$ starts
   with a typical jump (distributed according to $\mu$) and hits the interval $(t, \infty]$
   before it hits 0 again, and during this time no clock rings. The probability of this
   event is $E(e^{-\delta \tau_t^+}, \tau_t^+ < \tau_0)$.

2. The process $X$, started at $t$, makes an excursion ending in the interval
   $(t, \infty]$ without hitting 0, and no clock rings during this excursion. Since the
   contour process of the tree truncated below $t$ is started again at $t$, independently
   from the past, $n - 1$ such events occur successively, and each of them occurs
   independently with probability $E_t(e^{-\delta \tau_t^+}, \tau_t^+ < \tau_0)$.

3. $X$ starts at $t$ and reaches 0 before hitting the interval $(t, \infty]$, and during
   this time no clock rings. This happens with probability
   $E_t(e^{-\delta \tau_0}, \tau_t^+ > \tau_0)$.

The next step is to "glue" the path described in the third event above before the path
described in the first event above, into one excursion with infimum equal to 0 and in which
no clock rings. This is basically the inverse of Vervaat's transformation, see Figure 3
below, where the inverse is constructed in such a way that the infimum of the whole process
is performed during this newly created (first) excursion.

Since $X$ jumps at rate $b$ and has slope $-1$ (and also is translation invariant),
multiplying the probability of this concatenated path by $b\,dt$ gives the probability of an
excursion of $X$ away from $[0, +\infty)$ without ringing clock and with infimum in
$(-t - dt, -t)$. Then, $b\,\mathbb{P}(N_t = n, T > t)\,dt$ is the probability that $X$ makes
$n$ excursions away from $[0, +\infty)$ without making a clock ring, and that the infimum of
the whole path is performed in the first excursion and belongs to $(-t - dt, -t)$. This
yields

$$b\,\mathbb{P}(N_t = n, T > t)\,dt = E_0\Big(e^{-\delta \tau_0^+},\ -\inf_{0 \le s \le \tau_0^+} X_s \in dt\Big)\,\Big(E_0\big(e^{-\delta \tau_0^+},\ \tau_0^+ < \tau_{-t}\big)\Big)^{n-1}. \tag{3}$$

Now observe that, since the first factor in (3) is the differential in $t$ of
$E_0(e^{-\delta \tau_0^+}, \tau_0^+ < \tau_{-t})$,

$$\mathbb{P}(N_T = n, T \in dt) = \delta n\,\mathbb{P}(N_t = n, T > t)\,dt = \frac{\delta}{b}\,\frac{d}{dt}\Big[\Big(E_0\big(e^{-\delta \tau_0^+},\ \tau_0^+ < \tau_{-t}\big)\Big)^{n}\Big]\,dt.$$

Integrating over $t$ now gives

$$\mathbb{P}(N_T = n) = \frac{\delta}{b}\,\Big(E_0\big(e^{-\delta \tau_0^+}\big)\Big)^{n}.$$

A little elaboration on this argument also gives that the ages and residual lifetimes at
time $T$ should be i.i.d. and independent of $N_T$. Precise proofs are given in the
sections below.

3 Main general results 

3.1 Decomposition of the splitting tree at first detection time 

In this subsection, we call any càdlàg (continue à droite, avec des limites à gauche, i.e.,
right-continuous with left limits [8, p. 346]) path $e$ with lifetime
$V(e) \in [0, +\infty]$ an excursion. We use the notation $\mathscr{E}$ for the space of
excursions, endowed with Skorokhod's topology and the associated Borel $\sigma$-field.

For any time $t > 0$, we set $\rho_t$ the first exit time of $(0, t)$ and we let $w_0^t$
denote the finite path of the JCCP $X^{(t)}$ killed upon exiting $(0, t)$, that is,
$w_0^t := (X_s^{(t)};\ s \le \rho_t)$. Further, on the event $N_t = n \ge 1$, for
$i = 1, \dots, n$, we set $\sigma_i$ the $i$-th hitting time of $t$ by $X^{(t)}$ and we let
$w_i^t$ denote the path of the JCCP $X^{(t)}$ between times $\sigma_i$ and $\sigma_{i+1}$,
with the convention that $\sigma_{n+1} = \tau_0$, that is,

$$w_i^t(s) := X^{(t)}_{s + \sigma_i}, \qquad 0 \le s < \sigma_{i+1} - \sigma_i.$$

For $i = 0, \dots, n$, we denote by $\ell_i := V(w_i^t) = \sigma_{i+1} - \sigma_i$ the
lifetime of the excursion $w_i^t$ (so that $w_n^t(\ell_n) = 0$), and for
$i = 0, \dots, n - 1$, we record the size of the jump made by the contour process before
reflection by setting $w_i^t(\ell_i)$ equal to the date of death of the individual alive at
$t$ visited at time $\sigma_{i+1}$. In particular, $\ell_0$ is the life length of the
progenitor, and if $\ell_0 > t$ then $w_0^t$ is reduced to the one-point process that maps 0
to $\ell_0$.

In particular, when $t$ is fixed, we know [13] that conditional on $\{N_t = n\}$,

• the excursion $w_0^t$ follows the law of $X$ started at a jump distributed as $\mu$,
killed at $\tau_t^+$ and conditioned on $\tau_t^+ < \tau_0$;

• the excursions $w_i^t$, $i = 1, \dots, n - 1$, are i.i.d. and follow the law of $X$
started at $t$, killed at $\tau_t^+$ and conditioned on $\tau_t^+ < \tau_0$;

• the excursion $w_n^t$ follows the law of $X$ started at $t$, killed at $\tau_0$ and
conditioned on $\tau_0 < \tau_t^+$.

Now recall that all individuals are equipped with an independent exponential clock with
parameter $\delta$, and that the time when the first of those clocks rings is denoted by $T$
and called detection time.

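Proposition 1. Let $p := \mathbb{P}(T < \infty)$ be the probability that at least one clock
rings before extinction of the population. Then

$$p = E(1 - e^{-\delta \tau_0}).$$

More specifically,

$$\mathbb{P}(N_t = 0, T > t) = \mathbb{P}(N_t = 0, T = \infty) = E(e^{-\delta \tau_0}, \tau_0 < \tau_t^+)$$

and for any $n \ge 1$,

$$\mathbb{P}(N_t = n, T > t) = E(e^{-\delta \tau_t^+}, \tau_t^+ < \tau_0)\,\Big(E_t(e^{-\delta \tau_t^+}, \tau_t^+ < \tau_0)\Big)^{n-1}\,E_t(e^{-\delta \tau_0}, \tau_0 < \tau_t^+).$$

Further, let $n \ge 1$ and $G', G, F_1, \dots, F_{n-1}$ be non-negative, measurable
functions on $\mathscr{E}$. Then

$$\mathbb{E}\Big(G'(w_0^t)\,G(w_n^t) \prod_{i=1}^{n-1} F_i(w_i^t),\ N_T = n,\ T \in dt\Big) = \delta n\,dt\ E\big(G'(X_s;\ s \le \tau_t^+)\,e^{-\delta \tau_t^+},\ \tau_t^+ < \tau_0\big)$$
$$\times \Big\{\prod_{i=1}^{n-1} E_t\big(F_i(X_s;\ s \le \tau_t^+)\,e^{-\delta \tau_t^+},\ \tau_t^+ < \tau_0\big)\Big\}\ E_t\big(G(X_s;\ s \le \tau_0)\,e^{-\delta \tau_0},\ \tau_0 < \tau_t^+\big).$$
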



Proof. In the whole proof, let $\mathbf{e}$ denote an independent exponential r.v. with
parameter $\delta$. Since the JCCP has slope $-1$ and jump sizes equal to life lengths, the
lifetime $\tau_0$ of the contour process is exactly the sum of the lifespans of all
individuals in the population. As a consequence,

$$p = P(\tau_0 > \mathbf{e}) = E(1 - e^{-\delta \tau_0}).$$

Applying this property to the truncated contour process $X^{(t)}$ and using the path
decomposition preceding the statement of the proposition, we get

$$\begin{aligned}
\mathbb{P}(N_t = n, T > t) &= \mathbb{P}\big(N_t = n,\ \tau_0(X^{(t)}) < \mathbf{e}\big) = \mathbb{E}\big(e^{-\delta \tau_0(X^{(t)})},\ N_t = n\big) \\
&= P(\tau_t^+ < \tau_0)\,E\big(e^{-\delta V(w_0^t)}\big)\,\Big(P_t(\tau_t^+ < \tau_0)\,E\big(e^{-\delta V(w_1^t)}\big)\Big)^{n-1} \\
&\quad \times P_t(\tau_0 < \tau_t^+)\,E\big(e^{-\delta V(w_n^t)}\big),
\end{aligned}$$

which yields the desired expression.

Observing that the detection rate equals $\delta n$ conditional on $\{N_t = n\}$, we
finally get

$$\mathbb{E}\Big(G'(w_0^t)\,G(w_n^t) \prod_{i=1}^{n-1} F_i(w_i^t),\ N_T = n,\ T \in dt\Big) = \delta n\,dt\ \mathbb{E}\Big(G'(w_0^t)\,G(w_n^t) \prod_{i=1}^{n-1} F_i(w_i^t),\ N_t = n,\ T > t\Big),$$

and the desired equality follows by the same method as previously. □

Remark 1. Since the knowledge of the contour of the genealogical tree yields that of the
tree itself, the previous proposition characterises the law of the splitting tree stopped at
the first detection time (noting that conditional on $N_T = n$, the marked individual is of
course uniform among all $n$ alive individuals).

3.2 Rephrasing with i.i.d. excursions 

In this subsection, $e$ denotes an excursion distributed as $X$ started at 0 and killed upon
hitting $(0, +\infty]$. Recall that $e$ only takes negative values, except at 0
($e(0) = 0$) and at $V$, since $e(V) > 0$ on the event $\{V < \infty\}$ (on the
complementary event, $e$ drifts to $-\infty$).

Set $J(e) := \inf_s e(s)$. On the event $\{J \neq -\infty\}$ (which coincides a.s. with
$\{V < \infty\}$), we denote by $h(e)$ the unique time $h$ such that $e(h-) = J$. Also, we
denote by $e^{\leftarrow}$ the pre-$h$ process and by $e^{\rightarrow}$ the post-$h$
process:

$$e^{\leftarrow}(s) := e(s), \qquad 0 \le s < h(e),$$

with $e^{\leftarrow}(h(e)) = e^{\leftarrow}(h(e)-) = J(e)$, and

$$e^{\rightarrow}(s) := e(s + h(e)), \qquad 0 \le s \le V(e) - h(e).$$

Notice that with positive probability $V(e) = h(e)$, so that $e^{\rightarrow}$ is then
reduced to the one-point process that maps 0 to $e(V)$.

Let $n \ge 1$ and let $e_1, \dots, e_n$ denote i.i.d. excursions distributed as $e$. Set

$$I_n := \min_k J(e_k) \quad\text{and}\quad K_n := \arg\min_k J(e_k).$$

The next result is a consequence of the following two lemmas and Proposition 1.

Theorem 2. Let $G, G', F_1, \dots, F_{n-1}$ be non-negative, measurable functions on
$\mathscr{E}$. Then

$$\mathbb{E}\Big(G'(w_0^t)\,G(w_n^t) \prod_{i=1}^{n-1} F_i(w_i^t),\ N_T = n,\ T \in dt\Big)$$
$$= \frac{\delta}{b}\ E\Big(G(e^{\leftarrow}_{K_n} + t)\,G'(e^{\rightarrow}_{K_n} + t) \prod_{k \neq K_n} F_{k - K_n\ \mathrm{mod}\ n}(e_k + t) \prod_{k=1}^{n} e^{-\delta V(e_k)},\ -I_n \in dt\Big),$$

where $k - K_n\ \mathrm{mod}\ n$ is taken in $\{1, \dots, n - 1\}$.

Remark 2. Note that the expression inside the expectation in the rhs vanishes when one of
the excursions has infinite lifetime, that is, when there is some $k$ such that
$V(e_k) = +\infty$.

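Lemma 3. Let $G$ and $G'$ be two non-negative, measurable functions on $\mathscr{E}$. Then

$$E\big(G(e^{\leftarrow})\,G'(e^{\rightarrow}),\ -J \in dt\big) = b\,dt\ E_0\big(G(X_s;\ s \le \tau_{-t}),\ \tau_{-t} < \tau_0^+\big)\ E\big(G'(X_s - t;\ s \le \tau_t^+),\ \tau_t^+ < \tau_0\big).$$

Proof. Applying the strong Markov property at $\tau_{-t}$ yields

$$\begin{aligned}
E\big(G(e^{\leftarrow})\,G'(e^{\rightarrow}),\ -J \in dt\big) &= E\big(G(e^{\leftarrow})\,G'(e^{\rightarrow}),\ -J \in dt,\ \tau_{-t} < \tau_0^+\big) \\
&= E_0\big(G(X_s;\ s \le \tau_{-t}),\ \tau_{-t} < \tau_0^+\big) \\
&\quad \times dt \int \pi(dy)\, E_{y-t}\big(G'(X_s;\ s \le \tau_0^+),\ \tau_0^+ < \tau_{-t}\big),
\end{aligned}$$

which yields the result. □

Lemma 4. Let $G, G', F_1, \dots, F_n$ be non-negative, measurable functions on
$\mathscr{E}$. Then for $j = 1, \dots, n$,

$$E\Big(G(e^{\leftarrow}_{K_n})\,G'(e^{\rightarrow}_{K_n}) \prod_{k \neq K_n} F_k(e_k),\ -I_n \in dt,\ K_n = j\Big)$$
$$= b\,dt\ E_0\big(G(X_s;\ s \le \tau_{-t}),\ \tau_{-t} < \tau_0^+\big)\ E\big(G'(X_s - t;\ s \le \tau_t^+),\ \tau_t^+ < \tau_0\big) \prod_{k \neq j} E_0\big(F_k(X_s;\ s \le \tau_0^+),\ \tau_0^+ < \tau_{-t}\big).$$

Proof. The expression in the lhs equals

$$\begin{aligned}
&E\big(G(e_j^{\leftarrow})\,G'(e_j^{\rightarrow}),\ -J(e_j) \in dt\big)\ E\Big(\prod_{k \neq j} F_k(e_k),\ -J(e_k) < t\ \ \forall k \neq j\Big) \\
&\quad = E\big(G(e_j^{\leftarrow})\,G'(e_j^{\rightarrow}),\ -J(e_j) \in dt\big) \prod_{k \neq j} E_0\big(F_k(X_s;\ s \le \tau_0^+),\ \tau_0^+ < \tau_{-t}\big),
\end{aligned}$$

and the conclusion stems from the previous lemma. □

3.3 Rephrasing with Vervaat's transformation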

Forgetting about the terminal jump of each excursion (a piece of information that is
actually useful in the next section), Theorem 2 can be expressed in a more elegant way.

For any càdlàg path $Z$ with finite lifetime $V(Z)$ and law locally absolutely continuous
w.r.t. $X$, we set $I(Z) := \inf Z$ and we define $H(Z)$ as the unique time $t$ such that
$Z(t-) = I(Z)$. Finally we let $Z'$ denote Vervaat's transform of $Z$, defined as the path
with lifetime $V(Z)$ such that $Z'(V(Z)) = 0$ and

$$Z'(s) = Z\big(s + H(Z)\ \mathrm{mod}\ V(Z)\big) - I(Z), \qquad 0 \le s < V(Z).$$




Figure 3: Vervaat's transformation. Top panel: A path $Z$ with finite lifetime $V(Z)$
performing its infimum $I(Z)$ at time $H(Z)-$; Bottom panel: Vervaat's transform $Z'$ of
$Z$ obtained by shifting $Z$ by $-I(Z)$ and performing a circular time-change starting at
time $H(Z)$.

More specifically,

$$Z'(s) := \begin{cases}
Z(s + H(Z)) - I(Z) & \text{if } 0 \le s < V(Z) - H(Z), \\
Z(s + H(Z) - V(Z)) - I(Z) & \text{if } V(Z) - H(Z) \le s < V(Z), \\
0 & \text{if } s = V(Z).
\end{cases}$$

Note that $Z'$ takes positive values, apart from its terminal value equal to 0 (and that
$Z'$ is left-continuous at this point).
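As an illustration of this definition, here is a minimal sketch of a grid analogue of Vervaat's transformation in Python. The function name `vervaat_transform` and the sampling convention are assumptions made for this example only: the path is given by finitely many sampled values, and the sample where the minimum is attained plays the role of the left limit $Z(H(Z)-)$, so that the rotated path starts right after it and ends at 0.

```python
from typing import List, Sequence


def vervaat_transform(z: Sequence[float]) -> List[float]:
    """Grid analogue of Z'(s) = Z((s + H) mod V) - I.

    Convention (an assumption of this sketch): z[h], the first minimal sample, stands for
    the left limit Z(H-). The output therefore starts with the sample following z[h] and
    ends with z[h] - I = 0, which mimics the terminal value Z'(V) = 0.
    """
    if not z:
        raise ValueError("the path must contain at least one sample")
    m = len(z)
    infimum = min(z)
    h = z.index(infimum)              # first index where the infimum is attained
    return [z[(i + h + 1) % m] - infimum for i in range(m)]


# A sampled path starting at 0, reaching its minimum -2 and then jumping upwards.
path = [0.0, -1.0, -2.0, 1.0, 0.5]
rotated = vervaat_transform(path)
```

When the minimum is attained only once, all values of the output are positive except the last one, in line with the remark above.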

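Now let $Y_n$ denote the concatenation of the $n$ i.i.d. excursions
$(e_i)_{i=1,\dots,n}$ of the last subsection. In particular, $Y_n$ is equally distributed as
the Lévy process $X$ reflected below its supremum and killed at its $(n+1)$-st hitting time
of 0. Then observe that $I_n = I(Y_n)$ and let $Y_n'$ denote Vervaat's transform of $Y_n$.
We have the following corollary of Theorem 2.

Theorem 5. For any $n \ge 1$,

$$\mathbb{P}\big(N_T = n,\ T \in dt,\ X^{(T)} \in d\varepsilon\big) = \frac{\delta}{b}\, e^{-\delta V(\varepsilon)}\, P\big(-I_n \in dt,\ Y_n' \in d\varepsilon\big).$$

Proof. From Theorem 2 we get

$$\mathbb{E}\Big(G'(w_0^t)\,G(w_n^t) \prod_{i=1}^{n-1} F_i(w_i^t),\ N_T = n,\ T \in dt\Big)$$
$$= \frac{\delta}{b}\ E\Big(G(e^{\leftarrow}_{K_n} - I_n)\,G'(e^{\rightarrow}_{K_n} - I_n) \prod_{k \neq K_n} F_{k - K_n\ \mathrm{mod}\ n}(e_k - I_n)\ e^{-\delta V(Y_n)},\ -I_n \in dt\Big),$$

which, by a monotone class theorem fiU[ p. 2], ensures that for any non-negative measurable
function $F$ on $\mathscr{E}$,

$$\mathbb{E}\big(F(X^{(T)}),\ N_T = n,\ T \in dt\big) = \frac{\delta}{b}\ E\big(F(Y_n')\, e^{-\delta V(Y_n)},\ -I_n \in dt\big),$$

hence the result. □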

One can now push this path decomposition even further by starting from a path, say $Y$, of
$X$ reflected below its supremum as well as from an independent exponential random variable
$\mathbf{e}$ with parameter $\delta$. Note that $Y$ is the mere concatenation of a sequence
$(e_i)_{i \ge 1}$ of i.i.d. excursions distributed as $e$ (stopped at the first one with
infinite lifetime). Then let $M$ be the unique non-negative integer such that $\mathbf{e}$
falls into the $(M+1)$-st excursion of $Y$ away from 0:

$$M := \max\{n \ge 0 : V(e_1) + \cdots + V(e_n) < \mathbf{e}\},$$

with the usual convention that an empty sum is 0. As previously, define $Y_M$ as the path
$Y$ killed at its $(M+1)$-st hitting time of 0, set $I_M := \inf Y_M$ and let $Y_M'$ denote
Vervaat's transform of $Y_M$.

Theorem 6. For any $n \ge 1$,

$$\mathbb{P}\big(N_T = n,\ T \in dt,\ X^{(T)} \in d\varepsilon\big) = \frac{\delta}{b\,E_0(1 - e^{-\delta \tau_0^+})}\ P\big(M = n,\ -I_M \in dt,\ Y_M' \in d\varepsilon\big).$$

Proof. Using the definition of $M$, we get

$$\begin{aligned}
P\big(M = n,\ -I_M \in dt,\ Y_M' \in d\varepsilon\big) &= P\big(V(Y_n) < \mathbf{e} < V(Y_n) + V(e_{n+1}),\ -I_n \in dt,\ Y_n' \in d\varepsilon\big) \\
&= e^{-\delta V(\varepsilon)}\, E\big(1 - e^{-\delta V(e_{n+1})}\big)\, P\big(-I_n \in dt,\ Y_n' \in d\varepsilon\big) \\
&= e^{-\delta V(\varepsilon)}\, E_0\big(1 - e^{-\delta \tau_0^+}\big)\, P\big(-I_n \in dt,\ Y_n' \in d\varepsilon\big),
\end{aligned}$$

and an appeal to Theorem 5 yields the result. □



4 Applications and explicit formulae 

4.1 Some lower dimensional marginals of interest 

In this subsection, we give the joint law of $T$ and $N_T$, as well as the joint law of the
ages and residual lifetimes $(A_1, R_1, \dots, A_{N_T}, R_{N_T})$ of the $N_T$ alive
individuals at time $T$ on the event $\{T < \infty\}$.

The next statement follows from Theorem 2 by taking all functionals equal to 1. Note that
$E_0(e^{-\delta \tau_0^+}, \tau_0^+ < \tau_{-t})$ can be read as
$E_0(e^{-\delta \tau_0^+}, -\inf_{0 \le s \le \tau_0^+} X_s < t)$, so it is differentiable
with derivative equal to
$E_0(e^{-\delta \tau_0^+}, -\inf_{0 \le s \le \tau_0^+} X_s \in dt)/dt$.

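Corollary 7. Let $n \ge 1$ and $y, t > 0$. The joint law of $T$ and $N_T$ is given by

$$\begin{aligned}
\mathbb{P}(N_T = n, T \in dt) &= \frac{\delta}{b}\ E\Big(\prod_{k=1}^{n} e^{-\delta V(e_k)},\ -I_n \in dt\Big) \\
&= \frac{n\delta}{b}\ E_0\Big(e^{-\delta \tau_0^+},\ -\inf_{0 \le s \le \tau_0^+} X_s \in dt\Big)\,\Big(E_0\big(e^{-\delta \tau_0^+},\ \tau_0^+ < \tau_{-t}\big)\Big)^{n-1}.
\end{aligned}$$

Integrating the variable $t$ over $(0, y)$ yields

$$\mathbb{P}(N_T = n, T < y) = \frac{\delta}{b}\,\Big(E_0\big[e^{-\delta \tau_0^+},\ \tau_0^+ < \tau_{-y}\big]\Big)^{n},$$

and finally, letting $y \to \infty$, we get

$$\mathbb{P}(N_T = n) = \frac{\delta}{b}\,\Big(E_0\big[e^{-\delta \tau_0^+}\big]\Big)^{n}.$$
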



The next statement follows from Theorem 2 by reducing the functionals to functions of the
bivariate random variable $(-e(V(e)-), e(V(e)))$, known as the undershoot and overshoot of
$e$ at its first up-crossing of the $x$-axis. Indeed, recall that the age $A_i$ and the
residual lifetime $R_i$ of the $i$-th individual in the population at time $t$, in the order
of the contour, are seen directly on the JCCP as the undershoot and overshoot of
$w_{i-1}^t$ across $t$ (which occurs at time $V(w_{i-1}^t)$, with terminal value equal to
the date of death of this $i$-th individual). We use the notation
$\mathrm{Und}(e) := -e(V(e)-)$ and $\mathrm{Ove}(e) := e(V(e))$ for the undershoot and
overshoot of $e$.

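Corollary 8. The joint law of the ages and residual lifetimes
$(A_1, R_1, \dots, A_{N_T}, R_{N_T})$ of the $N_T$ alive individuals at time $T$ is given by

$$\begin{aligned}
&\mathbb{P}(N_T = n,\ T \in dt,\ A_i \in da_i,\ R_i \in dr_i,\ i = 1, \dots, n) \\
&\quad = \frac{\delta}{b}\ E\Big(\prod_{k=1}^{n} e^{-\delta V(e_k)},\ -I_n \in dt,\ \mathrm{Und}(e_{i-1+K_n\ \mathrm{mod}\ n}) \in da_i,\ \mathrm{Ove}(e_{i-1+K_n\ \mathrm{mod}\ n}) \in dr_i,\ i = 1, \dots, n\Big) \\
&\quad = \frac{n\delta}{b}\ E_0\Big(e^{-\delta \tau_0^+},\ -\inf_{0 \le s \le \tau_0^+} X_s \in dt,\ -X_{\tau_0^+ -} \in da_1,\ X_{\tau_0^+} \in dr_1\Big) \\
&\qquad \times \prod_{k=2}^{n} E_0\Big(e^{-\delta \tau_0^+},\ -X_{\tau_0^+ -} \in da_k,\ X_{\tau_0^+} \in dr_k,\ \tau_0^+ < \tau_{-t}\Big).
\end{aligned}$$
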

Recall from Theorem 2 (or observe from the last statement) that $A_1$ and $R_1$ are the
undershoot and overshoot of the excursion where the infimum is performed. In order to lose
this information (which is certainly not in the hands of whoever observes the beginning of
the epidemics), we reshuffle the labels of the individuals at $T$, on the event
$\{T < \infty, N_T = n\}$, by drawing independently a uniform permutation $\varsigma$ on
$\{1, \dots, n\}$ and setting

$$(A_i', R_i') := (A_{\varsigma(i)}, R_{\varsigma(i)}), \qquad i = 1, \dots, n.$$

The first equality in the next statement is a mere reformulation of Corollary 8 using the
previous definition. The integration part comes from the same argument as the one mentioned
before Corollary 7, i.e., by writing the event $\{\tau_0^+ < \tau_{-t}\}$ in the form
$\{-\inf_{0 \le s \le \tau_0^+} X_s < t\}$.

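Corollary 9. The joint law of the (reshuffled) ages and residual lifetimes
$(A_1', R_1', \dots, A_{N_T}', R_{N_T}')$ of the $N_T$ alive individuals at time $T$ is
given by

$$\begin{aligned}
&\mathbb{P}(N_T = n,\ T \in dt,\ A_i' \in da_i,\ R_i' \in dr_i,\ i = 1, \dots, n) \\
&\quad = \frac{\delta}{b} \sum_{i=1}^{n} E_0\Big(e^{-\delta \tau_0^+},\ -\inf_{0 \le s \le \tau_0^+} X_s \in dt,\ -X_{\tau_0^+ -} \in da_i,\ X_{\tau_0^+} \in dr_i\Big) \\
&\qquad \times \prod_{k \neq i} E_0\Big(e^{-\delta \tau_0^+},\ -X_{\tau_0^+ -} \in da_k,\ X_{\tau_0^+} \in dr_k,\ \tau_0^+ < \tau_{-t}\Big).
\end{aligned}$$

Integrating the variable $t$ over $(0, y)$ yields

$$\mathbb{P}(N_T = n,\ T < y,\ A_i' \in da_i,\ R_i' \in dr_i,\ i = 1, \dots, n) = \frac{\delta}{b} \prod_{k=1}^{n} E_0\Big(e^{-\delta \tau_0^+},\ -X_{\tau_0^+ -} \in da_k,\ X_{\tau_0^+} \in dr_k,\ \tau_0^+ < \tau_{-y}\Big),$$

and finally, letting $y \to \infty$, we get

$$\mathbb{P}(N_T = n,\ A_i' \in da_i,\ R_i' \in dr_i,\ i = 1, \dots, n) = \frac{\delta}{b} \prod_{k=1}^{n} E_0\Big(e^{-\delta \tau_0^+},\ -X_{\tau_0^+ -} \in da_k,\ X_{\tau_0^+} \in dr_k\Big).$$
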

Remark 3. We observe that conditional on $N_T = n$ and/or conditional on $N_T = n$ and
$T < y$, the random pairs $(A_i', R_i')$, $1 \le i \le n$, are i.i.d. and their common
distribution does not depend on $n$.

4.2 Completely asymmetric Lévy processes

Here, we seek to provide the reader with more explicit formulae regarding the quantities
considered in Subsection 4.1, taking advantage of background knowledge on Lévy processes.
Except for the proposition stated at the end of the present subsection, all results stated
here, and the references to their original contributors, can be found in [2, 13].

Instead of the jump measure $\pi$ of the Lévy process $X$ with no negative jumps, it can be
convenient to handle its Laplace exponent $\psi$ defined as

$$\psi(a) := a - \int_{(0, +\infty]} \pi(dx)\,(1 - e^{-ax}), \qquad a \ge 0. \tag{5}$$

Recall that the real number $\pi(\{\infty\})$ can be positive, since particles may have
infinite lifetimes. It is also the killing rate of $X$. The function $\psi$ is
differentiable and convex and we denote by $\eta$ its largest root. Then $\psi$ is
increasing on $[\eta, +\infty)$ and we denote by $\phi$ its inverse mapping on this set.
Furthermore, the so-called two-sided exit problem (exit of an interval from the bottom or
from the top by $X$) has a simple solution, in the form

$$P_s(\tau_0 < \tau_t^+) = \frac{W(t - s)}{W(t)}, \tag{6}$$

where the so-called scale function $W$ is the non-negative, nondecreasing, differentiable
function such that $W(0) = 1$, characterised by its Laplace transform

$$\int_0^{\infty} dx\, e^{-ax}\, W(x) = \frac{1}{\psi(a)}, \qquad a > \eta. \tag{7}$$
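To make the objects $\psi$ and $\phi$ concrete, here is a small numerical sketch. It assumes, purely for illustration, exponential lifetimes with rate `d`, i.e. $\pi(dx) = b\,d\,e^{-dx}\,dx$, a special case not singled out in the text; the names `psi_exponential`, `phi` and `phi_exponential` are hypothetical. The inverse $\phi(q)$ is computed by bisection, which only relies on the convexity of $\psi$ and on $\psi(0) = 0$ (no mass at infinity), and is compared with the closed form available in this special case.

```python
import math
from typing import Callable


def psi_exponential(a: float, b: float, d: float) -> float:
    """psi(a) = a - int pi(dx) (1 - exp(-a x)) for pi(dx) = b * d * exp(-d x) dx."""
    return a - b * a / (a + d)


def phi(psi: Callable[[float], float], q: float, tol: float = 1e-12) -> float:
    """Inverse of psi on [eta, +infinity) at level q > 0, by bisection.

    Assumes psi is convex with psi(0) = 0, so that psi crosses the level q exactly once
    on (0, +infinity), necessarily to the right of its largest root eta.
    """
    if q <= 0.0:
        raise ValueError("q must be positive")
    lo, hi = 0.0, 1.0
    while psi(hi) < q:                # enlarge the bracket until psi(hi) >= q
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if psi(mid) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def phi_exponential(q: float, b: float, d: float) -> float:
    """Closed form: positive root of a^2 + (d - b - q) a - q d = 0."""
    s = b + q - d
    return 0.5 * (s + math.sqrt(s * s + 4.0 * q * d))


b, d, q = 2.0, 1.0, 0.5
numerical = phi(lambda a: psi_exponential(a, b, d), q)
closed_form = phi_exponential(q, b, d)
```

For other lifetime distributions only `psi_exponential` needs to be replaced; the bisection itself does not use the exponential assumption.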

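Equation (6) gives the probability that $X$ exits the interval $(0, t]$ from the bottom. The
following formula gives the Laplace transform of the first exit time
$\rho_t := \tau_0 \wedge \tau_t^+$ on this event. For any $q \ge 0$,

$$E_s\big(e^{-q \rho_t},\ \tau_0 < \tau_t^+\big) = \frac{W^{(q)}(t - s)}{W^{(q)}(t)}, \tag{8}$$

where the so-called $q$-scale function $W^{(q)}$ is the non-negative, nondecreasing,
differentiable function such that $W^{(q)}(0) = 1$, characterised by its Laplace transform

$$\int_0^{\infty} dx\, e^{-ax}\, W^{(q)}(x) = \frac{1}{\psi(a) - q}, \qquad a > \phi(q). \tag{9}$$

The $q$-resolvent of the process killed upon exiting $(0, t]$ is given by the following
formula ($s, y \in (0, t]$):

$$u_t^q(s, y) := E_s\Big(\int_0^{\rho_t} e^{-qv}\, \mathbf{1}_{\{X_v \in dy\}}\, dv\Big)\Big/dy = \frac{W^{(q)}(t - s)\, W^{(q)}(y)}{W^{(q)}(t)} - \mathbf{1}_{\{y > s\}}\, W^{(q)}(y - s). \tag{10}$$

We also need the $q$-resolvent of the process killed upon exiting $(-\infty, 0]$
($s, y \ge 0$):

$$u^q(s, y) := E_{-s}\Big(\int_0^{\tau_0^+} e^{-qv}\, \mathbf{1}_{\{X_v \in -dy\}}\, dv\Big)\Big/dy = e^{-\phi(q) y}\, W^{(q)}(s) - \mathbf{1}_{\{s > y\}}\, W^{(q)}(s - y). \tag{11}$$

Last, we have the following expression for the bivariate law of the undershoot and overshoot
on the event that the process exits $(0, t]$ from the top:

$$E_s\big(e^{-q \rho_t},\ \tau_t^+ < \tau_0,\ X_{\rho_t -} \in dy,\ X_{\rho_t} - X_{\rho_t -} \in dz\big) = u_t^q(s, y)\, dy\, \pi(dz), \qquad z + y > t,\ y \in (0, t), \tag{12}$$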

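and the analogue for the exit from $(-\infty, 0]$.

The next statement deals with the following quantities of interest in relation to Subsection
4.1. For any $t \ge 0$ and $q \ge 0$, set

$$G_q(t) := 1 - E_0\big(e^{-q \tau_0^+},\ \tau_0^+ < \tau_{-t}\big).$$

In particular, as $t \to \infty$, $G_q(t)$ converges to
$G_q(\infty) = 1 - E_0(e^{-q \tau_0^+})$.

Proposition 10. For any $q, r \ge 0$ and $0 < a < t$,

$$E_0\big(e^{-q \tau_0^+},\ -X_{\tau_0^+ -} \in da,\ X_{\tau_0^+} \in dr,\ \tau_0^+ < \tau_{-t}\big) = \frac{W^{(q)}(t - a)}{W^{(q)}(t)}\, da\, \pi(a + dr)$$

and

$$E_0\big(e^{-q \tau_0^+},\ -X_{\tau_0^+ -} \in da,\ X_{\tau_0^+} \in dr\big) = e^{-\phi(q) a}\, da\, \pi(a + dr).$$

For any $q, t > 0$,

$$G_q(t) = \frac{1 + q \int_0^t W^{(q)}(s)\, ds}{W^{(q)}(t)}$$

and

$$G_q(\infty) = \frac{q}{\phi(q)}.$$
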

Proof. The first two displays stem from instantiating (11 01) (resp. ( ITTl) ) and ( fT2l) at s = t, 
using the spatial homogeneity of Levy processes (resp. s = 0). 

Writing 7f(x) := 7r((x, +oo]), a; > 0, we get from the first display 

/ + \ W^'^Ht — a) 

Eo (^e-^^o , _x^+_ e da, < r_t) = ^(^^^^^ da7r{a), 

so that 

where ^ 

gg{t):= ! daW^''\t- a)Ti{a) t > 0. 
Jo 

To prove the third display, it remains to prove that for all t > 0, 

W^''\t)-gg{t) = l + q [ W'^''\s)ds. 

Jo 
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Now for any A > 0(g), the Laplace transform of the non- negative mapping hq : t ^ 9qit) + 
1 + g Jp W^''\s) ds evaluated at A is 

/ dte-^'hJt) = T + ^T-. / dte-^' n(dr) + q dt e-^' dsW^'^Hs) 

Jo A V^(A) -q Jo J{t,+oc] Jo Jo 

1 1 f , , . l-e-^' , ^e-^' 



A ^(A) - q i(o,+oo] A 
1 , A-^(A) ^ q 



A I ^(A)-g ^(A)-gJ ^(A)-g' 

which is the Laplace transform of W^'^'' and characterises it. 

The last two equalities are classical results in fluctuation theory of Lévy processes [2].
To be more specific, the first equality is the well-known fact that the inverse mapping of the
Laplace exponent of a Lévy process without negative jumps is the Laplace exponent of its
dual ladder time process. Since $q \mapsto G_q(\infty) = \mathbb{E}_0\big(1 - e^{-q\tau_0^+}\big)$ is the Laplace exponent of the
ladder time process of $X$, the Wiener-Hopf factorisation yields the second equality (which
could also be proved in the same fashion as the third display, using the second display). □
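
As a numerical illustration of Proposition 10, here is a minimal Python sketch under an assumption that is not made in the text: lifetimes are exponential with rate `nu`, so that $\psi(\lambda) = \lambda - b\lambda/(\lambda+\nu)$. In that case $1/(\psi(\lambda)-q)$ is a rational function of $\lambda$ and can be inverted by partial fractions, giving $W^{(q)}(t) = A e^{\phi(q)t} + B e^{-\rho t}$, where $\phi(q)$ and $-\rho$ are the roots of $\lambda^2 + (\nu - b - q)\lambda - q\nu = 0$. The function name and the parameter values used below are hypothetical.

```python
import math

def exponential_case(q, b, nu):
    """Return (phi, W, G) for psi(lam) = lam - b*lam/(lam + nu) and q > 0.

    Assumption (not from the text): exponential lifetimes with rate nu.
    W is the q-scale function, G is t -> (1 + q*int_0^t W) / W(t).
    """
    # psi(lam) = q  <=>  lam^2 + (nu - b - q)*lam - q*nu = 0
    c = nu - b - q
    disc = math.sqrt(c * c + 4.0 * q * nu)
    phi = (-c + disc) / 2.0   # positive root, i.e. phi(q)
    rho = (c + disc) / 2.0    # minus the negative root
    # Partial fractions of (lam + nu) / ((lam - phi) * (lam + rho))
    A = (phi + nu) / (phi + rho)
    B = (rho - nu) / (phi + rho)

    def W(t):
        return A * math.exp(phi * t) + B * math.exp(-rho * t)

    def int_W(t):
        # Closed form of int_0^t W(s) ds
        return A * math.expm1(phi * t) / phi - B * math.expm1(-rho * t) / rho

    def G(t):
        return (1.0 + q * int_W(t)) / W(t)

    return phi, W, G
```

With `q, b, nu = 0.5, 2.0, 1.0`, the function `G` starts at 1 for `t = 0`, decreases, and approaches `q / phi` for large `t`, in line with the limit $G_q(\infty) = q/\phi(q)$. For very large `t` the exponentials overflow in floating point, so this sketch is only meant for moderate time horizons.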



4.3 Summary statement with explicit formulae 

The analytical results of Proposition 10 can be applied straightforwardly to rephrase the
conceptual results of Subsection 4.1, at the preference of the reader. The next statement is
one of the practical ways of doing this. It provides explicit formulae, up to the knowledge (or
numerical computation) of $\delta$-scale functions (occasionally via $G_\delta$, but then use Proposition
10) and $\phi$ (which is fast to compute as the inverse mapping of $\psi$), for various marginals of
interest of the splitting tree stopped when the first clock rings.
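
The inversion of $\psi$ can be sketched in a few lines of Python. The sketch assumes that the user supplies a function `laplace_mu` (a hypothetical name) returning $\int_{(0,\infty)} e^{-\lambda x}\mu(dx)$, so that $\psi(\lambda) = \lambda - b\,(1 - \texttt{laplace\_mu}(\lambda))$. Since $\lambda - b \le \psi(\lambda) \le \lambda$ and $\psi$ is convex with $\psi(0) = 0$, the root $\phi(q)$ of $\psi = q$ lies in $[q, q+b]$ for $q > 0$, and plain bisection is enough.

```python
def psi(lam, b, laplace_mu):
    """Laplace exponent psi(lam) = lam - b * (1 - E[exp(-lam * lifetime)])."""
    return lam - b * (1.0 - laplace_mu(lam))

def phi(q, b, laplace_mu, tol=1e-12):
    """Invert psi on [q, q + b] by bisection (assumes q > 0 and b > 0)."""
    lo, hi = q, q + b            # psi(lo) <= q <= psi(hi)
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if psi(mid, b, laplace_mu) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For instance, with exponential lifetimes of rate 1 one may pass `laplace_mu = lambda lam: 1.0 / (1.0 + lam)`; any other lifetime law only requires changing that one function.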

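Proposition 11. Let $n \ge 1$ and $y, t > 0$. The joint law of $T$ and $N_T$ is given by

\[ \mathbb{P}(N_T = n,\ T \in dt) = -\frac{\delta n}{b}\,G_\delta'(t)\,\big(1 - G_\delta(t)\big)^{n-1}\,dt. \]

As a consequence,

\[ \mathbb{P}(N_T = n,\ T < y) = \frac{\delta}{b}\,\big(1 - G_\delta(y)\big)^{n}, \]

with respective one-dimensional marginals

\[ \mathbb{P}(N_T = n) = \frac{\delta}{b}\,\big(1 - G_\delta(\infty)\big)^{n} \quad\text{and}\quad \mathbb{P}(T < y) = \frac{\delta}{b}\,\frac{1 - G_\delta(y)}{G_\delta(y)}. \]

In particular,

\[ \mathbb{P}(N_T = n \mid T = t) = n\,G_\delta(t)^2\,\big(1 - G_\delta(t)\big)^{n-1} \quad\text{and}\quad \mathbb{P}(N_T = n \mid T < y) = G_\delta(y)\,\big(1 - G_\delta(y)\big)^{n-1}. \]

Also, the probability $p$ that $T < \infty$ equals

\[ p = \frac{\delta}{b}\,\frac{1 - G_\delta(\infty)}{G_\delta(\infty)} = \frac{\phi(\delta) - \delta}{b}. \]

Conditional on $\{N_T = n,\ T < y\}$ ($y \le \infty$), the ages and residual lifetimes of the $n$ alive
individuals at time $T$ are i.i.d., distributed as the r.v. $(A(y), R(y))$ (independent of $n$). If
$y < \infty$,

\[ \mathbb{P}(A(y) \in da,\ R(y) \in dr) = \frac{1}{1 - G_\delta(y)}\,\frac{W^{(\delta)}(y-a)}{W^{(\delta)}(y)}\,da\,d\pi(a+r) = \frac{W^{(\delta)}(y-a)}{W^{(\delta)}(y) - 1 - \delta\int_0^y W^{(\delta)}(s)\,ds}\,da\,d\pi(a+r); \]

if $y = \infty$,

\[ \mathbb{P}(A(\infty) \in da,\ R(\infty) \in dr) = \frac{1}{1 - G_\delta(\infty)}\,e^{-\phi(\delta)a}\,da\,d\pi(a+r) = \frac{\phi(\delta)}{\phi(\delta) - \delta}\,e^{-\phi(\delta)a}\,da\,d\pi(a+r). \]
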
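
The unconditional and conditional laws of the outbreak size in Proposition 11 (case $y = \infty$) are easy to tabulate once $\phi(\delta)$ is known. The following Python sketch is only an illustration: the function name `outbreak_size_laws` and the truncation level `n_max` are hypothetical, and $\phi(\delta)$ is taken as an input rather than computed.

```python
def outbreak_size_laws(b, delta, phi_delta, n_max=200):
    """Tabulate the laws of N_T from Proposition 11 with y = infinity.

    Returns (p, joint, cond) where
      p        = P(T < inf) = (phi(delta) - delta) / b,
      joint[k] = P(N_T = k + 1) = (delta / b) * (1 - G)^(k + 1),
      cond[k]  = P(N_T = k + 1 | T < inf) = G * (1 - G)^k,
    with G = G_delta(inf) = delta / phi(delta).
    """
    G = delta / phi_delta
    p = (phi_delta - delta) / b
    joint = [delta / b * (1.0 - G) ** n for n in range(1, n_max + 1)]
    cond = [G * (1.0 - G) ** (n - 1) for n in range(1, n_max + 1)]
    return p, joint, cond
```

Summing `joint` over all sizes recovers `p`, and dividing `joint` by `p` recovers the geometric law `cond`, which is a quick consistency check on the two expressions for $p$.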

5 A (more general) model of epidemics 

As in [4, 16], we aim to model the spread of some antibiotic-resistant bacteria like MRSA
(Methicillin-resistant Staphylococcus aureus) in a hospital. Once in a while, a patient is 
colonised by MRSA (presumably by introduction from outside) and this may cause an out- 
break in the hospital. This outbreak is only detected at the first time T when one of the 
carriers declares herself, i.e., when the first symptoms appear in a carrier, or at the first 
positive medical exam of a carrier. 
We assume that 

• patients have i.i.d. lengths of stay in the hospital, all distributed as some positive
random variable K with finite expectation; 

• the outbreak starts with the infection of a randomly chosen patient; 

• the length of stay is not influenced by whether or not an individual carries MRSA 
(neutrality, or exchangeability assumption); 

• during an outbreak no further introductions from outside occur (no immigration); 

• carriers are infective from the first time they were infected till their departure from the
hospital;

• while infective, patients independently transmit MRSA to other individuals at times
of a Poisson process with parameter b (susceptible individuals are always assumed to 
be in excess, so that effects of the finite size of the hospital are ignored); 

• as a consequence of the renewal theorem (assuming stationarity of the regenerative 
set of arrivals at the hospital), the length of stay of a patient conditional on infection 
is a size-biased version of K, and the time at which she is infected is independent, 
uniformly distributed during her stay; 

• each patient can be detected to be a carrier only after an independent exponential time 
with parameter $\delta$ running from the beginning of her infection (time of screening or of
developing symptoms in this patient). The first time T when a carrier is detected is 
called detection time; 

• At detection time, all patients in the hospital are screened with a perfect test, so all 
carriers at T are immediately identified. 

Remark 4. The second assumption can be disputable, since MRSA is often introduced by a 
patient who already carries MRSA before entering the hospital (personal communication with 
Martin Bootsma). Changing this assumption on introduction of MRSA for a more realistic 
one would make the analysis harder, although possible, and obscure the illustrative character 
of the example provided in this section. 

It is not possible to obtain useful data from patients who already left the hospital at 
the moment of detection. Indeed, most carriers leaving the hospital will soon lose MRSA 
because - in the absence of antibiotic pressure - the antibiotic resistant strains will soon be 
outcompeted by antibiotic-susceptible strains. Our goal is therefore to infer the parameters of
the epidemic using the available medical data on the detected carriers.

Thus the model is a Crump-Mode-Jagers branching process where every birth event is
interpreted as an infection, and individuals are endowed with i.i.d. bivariate r.v. distributed
as the pair $(U, V)$, with $V$ the lifetime (as an infective), i.e., the time between infection and
departure from the hospital, and $U$ the time already spent in the hospital before infection.
Individuals "give birth" at constant rate $b$ during their (infective) lifetime (length $V$) to
copies of themselves. Finally, the joint law of $(U, V)$ is given by

\[ \mathbb{E}(f(U,V)) = m^{-1}\int_{(0,\infty)} \mathbb{P}(K \in dz)\int_0^z dx\,f(x, z-x), \tag{13} \]

where $m := \mathbb{E}(K)$ and $f$ is any non-negative Borel function.
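
Formula (13) says that the total stay of an infected patient is a size-biased copy of $K$ and that the infection time is uniform over that stay. A minimal simulation sketch in Python follows, under the illustrative assumption (not made in the text at this point) that $K$ is exponential with rate `nu`; the size-biased law is then a Gamma law with shape 2 and rate `nu`. The function name `sample_UV` is hypothetical.

```python
import random

def sample_UV(nu, rng):
    """Draw one pair (U, V) as in (13), assuming K ~ Exp(nu).

    The size-biased version of Exp(nu) has density nu^2 * x * exp(-nu * x),
    i.e. Gamma(shape=2, scale=1/nu). U is the time spent in hospital before
    infection, V the remaining (infective) time.
    """
    total_stay = rng.gammavariate(2.0, 1.0 / nu)  # size-biased length of stay
    u = rng.uniform(0.0, total_stay)              # uniform infection time
    return u, total_stay - u
```

In this exponential case the marginal law of $V$ has density $m^{-1}\mathbb{P}(K > x) = \nu e^{-\nu x}$, so the empirical mean of the simulated values of `V` should be close to `1 / nu`.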






At detection time $T$, all carriers $i = 1, \dots, N_T$ are identified and we focus on the following
medical data belonging to them:

• $U_i$ is the time already spent in the hospital by carrier $i$ upon her infection;

• $A_i$ is the time elapsed between infection of carrier $i$ and $T$ ('age' of infection);

• $R_i$ is the remaining length of stay of carrier $i$ in the hospital after $T$ ('residual lifetime'
of infection);

• $V_i := A_i + R_i$ is the total infective lifetime of carrier $i$;

• $H_i := U_i + A_i$ is the time elapsed between entrance in the hospital of carrier $i$ and time
$T$.

Note that $(U_i, V_i)$ is merely the typical pair $(U, V)$ attached to carrier $i$, and that $A_i$ and $R_i$
have the interpretations given in the previous section. The quantities of empirical interest are
the random variables $H_i$, which should be easy to obtain from the hospital administrations.
Also, the distribution of $K$ should be easy to estimate from hospital data.

It is not difficult to see that with this extra information, Proposition 11 still holds with
$\mu$ (and hence $\psi$, $\phi$, $W^{(q)}$, ...) defined thanks to (13) as

\[ \mu(dx) := \mathbb{P}(V \in dx) = m^{-1}\,\mathbb{P}(K > x)\,dx. \tag{14} \]

Actually, we also have the following straightforward extension of Proposition 11.

Corollary 12. Conditional on $\{N_T = n\}$, the triples $(U_i, A_i, R_i)$ of the $n$ (randomly labelled)
carriers at time $T$ are i.i.d., distributed as the r.v. $(U, A, R)$ (independent of $n$), where

\[ \mathbb{E}(f(U,A,R)) = \frac{b}{m}\,\frac{\phi(\delta)}{\phi(\delta)-\delta}\int_{u=0}^{\infty} du\int_{a=0}^{\infty} da\int_{z=u+a}^{\infty} \mathbb{P}(K \in dz)\,e^{-\phi(\delta)a}\,f(u, a, z-u-a). \]

In particular, the times $H_i = U_i + A_i$ spent in the hospital up to time $T$ are i.i.d., distributed
as the r.v. $H$:

\[ \mathbb{P}(H \in dy) = \frac{b}{m}\,\frac{1}{\phi(\delta)-\delta}\,\mathbb{P}(K > y)\,\big(1 - e^{-\phi(\delta)y}\big)\,dy. \tag{15} \]

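Remark 5. From the definition of $\phi(\delta)$ we deduce that

\[ \delta = \phi(\delta) - b\int_0^\infty \mu(dx)\,\big(1 - e^{-\phi(\delta)x}\big), \quad\text{that is,}\quad \frac{b}{\phi(\delta)-\delta} = \frac{1}{\int_0^\infty \mu(dx)\,\big(1 - e^{-\phi(\delta)x}\big)}, \]

and (15) might be rewritten as

\[ \mathbb{P}(H \in dy) = \frac{1}{m\int_0^\infty \mu(dx)\,\big(1 - e^{-\phi(\delta)x}\big)}\,\mathbb{P}(K > y)\,\big(1 - e^{-\phi(\delta)y}\big)\,dy. \]

Finally, filling in (14) gives

\[ \mathbb{P}(H \in dy) = \frac{\mathbb{P}(K > y)\,\big(1 - e^{-\phi(\delta)y}\big)\,dy}{\int_0^\infty \mathbb{P}(K > x)\,\big(1 - e^{-\phi(\delta)x}\big)\,dx}. \tag{16} \]

The r.h.s. depends on $K$ (which might be estimated from independent hospital data) and $\phi(\delta)$
only.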
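
The last density in Remark 5 only involves the survival function of $K$ and the number $\phi(\delta)$, so it can be evaluated numerically for any length-of-stay distribution. The Python sketch below is one way to do this, using composite Simpson quadrature on a truncated interval; the names `simpson` and `make_h_density` and the truncation point `upper` are hypothetical, and the truncation is an approximation that presumes the survival function is negligible beyond `upper`.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule on [a, b] with an even number n of panels."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * h)
    return total * h / 3.0

def make_h_density(surv, phi_delta, upper):
    """Return y -> density of H, given surv(x) = P(K > x) and phi(delta).

    The normalising integral over (0, infinity) is truncated at `upper`.
    """
    def unnormalised(x):
        return surv(x) * (1.0 - math.exp(-phi_delta * x))

    norm = simpson(unnormalised, 0.0, upper)
    return lambda y: unnormalised(y) / norm
```

When $K$ is exponential with rate $\nu$, the normalising integral equals $\phi(\delta)/(\nu(\nu+\phi(\delta)))$ in closed form, which gives a convenient check of the quadrature.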

Now assume that various outbreaks in various hospitals are observed at their detection 
times. If the sizes of outbreaks (all distributed as $N_T$) are the only observable statistics,
then, as stressed in [4, 16], the fact that $N_T$ is geometrically distributed only allows for
the estimation of a single epidemiological parameter. Enlarging this information to, e.g., 
the times $H_i$ spent in the hospital before $T$, we can hope to make finer inferences on the
dynamical characteristics of those epidemics. 

Assume that $n$ outbreaks are observed, of sizes $x_1, x_2, \dots, x_n \in \mathbb{N}_{>0}$, and that $s(n) := \sum_{i=1}^n x_i$
carriers are detected, which at the time of detection have been in the hospital for
$y_1, y_2, \dots, y_{s(n)} \in \mathbb{R}_+$ time units. We also assume that, since the distribution of $K$ may be
estimated from independent hospital data, its distribution is known exactly.

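Using Remark 5, the likelihood of the observations $L(b, \delta; x_1, \dots, x_n, y_1, \dots, y_{s(n)})$ is given
by

\[ L = \left(\frac{\delta}{\phi(\delta)}\right)^{n}\left(1 - \frac{\delta}{\phi(\delta)}\right)^{s(n)-n}\ \prod_{j=1}^{s(n)} \frac{\mathbb{P}(K > y_j)\,\big(1 - e^{-\phi(\delta)y_j}\big)}{\int_0^\infty \mathbb{P}(K > x)\,\big(1 - e^{-\phi(\delta)x}\big)\,dx}. \tag{17} \]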



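We write $L = L_1 L_2$, where

\[ L_1(b,\delta) := \left(\frac{\delta}{\phi(\delta)}\right)^{n}\left(1 - \frac{\delta}{\phi(\delta)}\right)^{s(n)-n} \]

and

\[ L_2(b,\delta) := \prod_{j=1}^{s(n)} \frac{\mathbb{P}(K > y_j)\,\big(1 - e^{-\phi(\delta)y_j}\big)}{\int_0^\infty \mathbb{P}(K > x)\,\big(1 - e^{-\phi(\delta)x}\big)\,dx}. \]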

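We observe that $L_1$ is the likelihood function for $n$ realisations $x_1, x_2, \dots, x_n$ of i.i.d. geometric
random variables with parameter $g_1 := g_1(b,\delta) := \delta/\phi(\delta)$, while $L_2$ (assuming that the
distribution of $K$ is exactly known) only depends on $g_2 := g_2(b,\delta) := \phi(\delta)$. Observe that
$\delta = g_1 g_2$ and (recalling $m := \mathbb{E}(K)$),

\[ b = \frac{\phi(\delta) - \delta}{\int_0^\infty \mu(dx)\,\big(1 - e^{-\phi(\delta)x}\big)} = \frac{m\,g_2\,(1 - g_1)}{\int_0^\infty \mathbb{P}(K > x)\,\big(1 - e^{-g_2 x}\big)\,dx}. \]

Furthermore, reparametrization of $L_1(b,\delta)$ as a function of $g_1$ and $g_2$ results in a function
which is independent of $g_2$, while reparametrization of $L_2(b,\delta)$ as a function of $g_1$ and $g_2$
results in a function which is independent of $g_1$. It is straightforward to deduce that the
maximum likelihood estimator (MLE) of $g_1$, say $\hat g_1$, is given by

\[ \hat g_1 = \frac{n}{s(n)}, \]

while, since $L_1$ does not depend on $g_2$, the MLE of $g_2$, say $\hat g_2$, is given by

\[ \hat g_2 = \operatorname*{argmax}_{g_2}\ \prod_{j=1}^{s(n)} \frac{1 - e^{-g_2 y_j}}{\int_0^\infty \mathbb{P}(K > x)\,\big(1 - e^{-g_2 x}\big)\,dx}. \]

Standard theory on maximum likelihood estimation (invariance of the MLE under
reparametrization) gives that the MLEs of $b$, say $\hat b$, and $\delta$, say $\hat\delta$, are given by

\[ \hat b = \frac{m\,\hat g_2\,(1 - \hat g_1)}{\int_0^\infty \mathbb{P}(K > x)\,\big(1 - e^{-\hat g_2 x}\big)\,dx} \quad\text{and}\quad \hat\delta = \hat g_1\,\hat g_2. \]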



If $K$ is exponentially distributed with parameter $\nu$ [4, 16], then $\hat g_1 = n/s(n)$, while

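\[ \hat g_2 = \operatorname*{argmax}_{g_2}\ \prod_{j=1}^{s(n)} \frac{\nu + g_2}{g_2}\,\big(1 - e^{-g_2 y_j}\big) = \operatorname*{argmax}_{g_2}\ \left[\, s(n)\,\log\!\left(\frac{\nu + g_2}{g_2}\right) + \sum_{j=1}^{s(n)} \log\big(1 - e^{-g_2 y_j}\big) \right], \]

and the MLEs of $b$ and $\delta$ are given by $\hat b = (1 - \hat g_1)(\nu + \hat g_2)$ and $\hat\delta = \hat g_1 \hat g_2$.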
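
For exponentially distributed $K$, the estimation procedure above can be sketched in Python as follows. This is a minimal illustration, not a tuned implementation: the function names are hypothetical, the rate `nu` is assumed known, and the maximisation over $g_2$ is done by a crude search on a logarithmic grid between $10^{-3}$ and $10^{3}$ instead of a proper optimiser. A maximiser sitting on the edge of the grid would indicate that the likelihood is monotone over that range for the data at hand.

```python
import math

def log_lik_g2(g2, ys, nu):
    """Log-likelihood in g2, up to terms that do not depend on g2."""
    return (len(ys) * math.log((nu + g2) / g2)
            + sum(math.log(-math.expm1(-g2 * y)) for y in ys))

def mle_exponential_stay(xs, ys, nu, grid_size=600):
    """Return (g1_hat, g2_hat, b_hat, delta_hat), assuming K ~ Exp(nu).

    xs: outbreak sizes at detection; ys: times spent in hospital by the
    detected carriers (one entry per carrier, so len(ys) == sum(xs)).
    """
    n, s = len(xs), sum(xs)
    if s != len(ys):
        raise ValueError("need one hospital time per detected carrier")
    g1 = n / s
    # Log-spaced grid on [1e-3, 1e3]; hypothetical search range.
    grid = [10.0 ** (-3.0 + 6.0 * k / grid_size) for k in range(grid_size + 1)]
    g2 = max(grid, key=lambda g: log_lik_g2(g, ys, nu))
    b_hat = (1.0 - g1) * (nu + g2)
    delta_hat = g1 * g2
    return g1, g2, b_hat, delta_hat
```

A call such as `mle_exponential_stay([1, 3, 2], [0.4, 1.2, 2.5, 0.9, 3.1, 1.7], 1.0)` uses made-up data for three outbreaks with six carriers in total; with real data one would refine the grid around the maximiser or switch to a one-dimensional optimiser.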

Note that it is possible to allow for differences in the distributions of lengths of stay 
(the random variable K) and infection rates (the parameter b) for different hospitals, while 
keeping the biologically governed rate of onset of symptoms (5) the same for all hospitals. 
In that case we use the likelihood (17), with hospital-specific parameters and observations,
for estimation.

Derivation of similar formulae for models relaxing overly simplistic assumptions (see Remark
4), and applications to real hospital data, will be addressed in future work.